We identify the task of measuring data to quantitatively characterize the composition of machine learning data and datasets. Similar to an object's height, width, and volume, data measurements quantify different attributes of data along common dimensions that support comparison. Several lines of research have proposed what we refer to as measurements, with differing terminology; we bring some of this work together, particularly in fields of computer vision and language, and build from it to motivate measuring data as a critical component of responsible AI development. Measuring data aids in systematically building and analyzing machine learning (ML) data towards specific goals and gaining better control of what modern ML systems will learn. We conclude with a discussion of the many avenues of future work, the limitations of data measurements, and how to leverage these measurement approaches in research and practice.
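As a toy illustration of what such measurements might look like for a text dataset, the sketch below computes a few simple quantities along common dimensions (instance count, mean token length, vocabulary size). The specific measurements and function names are illustrative, not taken from the paper:

```python
from collections import Counter

def measure_text_dataset(texts):
    """Compute a few simple illustrative 'measurements' of a text dataset:
    number of instances, mean length in whitespace tokens, vocabulary size."""
    token_lists = [t.lower().split() for t in texts]
    vocab = Counter(tok for toks in token_lists for tok in toks)
    n = len(texts)
    return {
        "num_instances": n,
        "mean_length": sum(len(toks) for toks in token_lists) / n,
        "vocab_size": len(vocab),
    }

corpus = ["The cat sat on the mat", "A dog barked", "The dog and the cat"]
m = measure_text_dataset(corpus)
```

Because the outputs share units and dimensions, two datasets measured this way can be compared directly, which is the point of the height/width/volume analogy.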
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
The infrastructure necessary for training state-of-the-art models has become prohibitively expensive, making the training of such models affordable only to large corporations and institutions. Recent work has proposed several methods for training such models collaboratively, i.e., by pooling the hardware of many independent parties and training a shared model over the Internet. In this demonstration, we collaboratively trained a text-to-image transformer similar to OpenAI DALL-E. We invite viewers to join the ongoing training run, showing them instructions on how to contribute using the available hardware. We explain how to address the engineering challenges associated with such a training run (slow communication, limited memory, uneven performance between devices, and security concerns) and discuss how viewers can set up collaborative training runs themselves. Finally, we show that the resulting model generates images of reasonable quality on a number of prompts.
Evaluation in machine learning is usually informed by past choices, for example which datasets or metrics to use. This standardization enables comparison on equal footing using leaderboards, but the evaluation choices become sub-optimal as better alternatives arise. This problem is especially pertinent in natural language generation, which requires ever-improving suites of datasets, metrics, and human evaluation to make definitive claims. To make following best model-evaluation practices easier, we introduce GEMv2. The new version of the Generation, Evaluation, and Metrics benchmark provides a modular infrastructure for dataset, model, and metric developers to benefit from each other's work. GEMv2 supports 40 documented datasets in 51 languages. Models for all datasets can be evaluated online, and our interactive data-card creation and rendering tools make it easier to add new datasets to the living benchmark.
Modern deep learning applications require increasingly more compute to train state-of-the-art models. To address this demand, large corporations and institutions use dedicated high-performance computing clusters, whose construction and maintenance are both costly and far beyond the budget of most organizations. As a result, some research directions have become the exclusive domain of a few large industrial and even fewer academic actors. To alleviate this disparity, smaller groups may pool their computational resources and run collaborative experiments that benefit all participants. This paradigm, known as grid or volunteer computing, has seen successful applications in numerous scientific areas. However, using this approach for machine learning is difficult due to high latency, asymmetric bandwidth, and several challenges unique to volunteer computing. In this work, we carefully analyze these constraints and propose a novel algorithmic framework designed specifically for collaborative training. We demonstrate the effectiveness of our approach for SwAV and ALBERT pretraining in realistic conditions and achieve performance comparable to traditional setups at a fraction of the cost. Finally, we provide a detailed report of a successful collaborative language model pretraining run with 40 participants.
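To illustrate the general idea of pooling compute across parties, the sketch below simulates peers that each take local gradient steps on their own objective and then average their parameters. This is a minimal synchronous data-parallel baseline, not the paper's actual algorithm (which is specifically designed to tolerate slow communication and unreliable, heterogeneous volunteers); all names are illustrative:

```python
def local_sgd_step(params, grad_fn, lr=0.1):
    """One local gradient step on a peer's copy of the parameters."""
    grads = grad_fn(params)
    return [p - lr * g for p, g in zip(params, grads)]

def average_params(peer_params):
    """Synchronous all-reduce by plain averaging across peers."""
    n = len(peer_params)
    dim = len(peer_params[0])
    return [sum(p[i] for p in peer_params) / n for i in range(dim)]

# Toy objective per peer: minimize (x - target)^2, so grad = 2 * (x - target).
def grad_quadratic(target):
    return lambda params: [2 * (params[0] - target)]

peer_targets = [1.0, 2.0, 3.0]          # each peer "sees" different data
params = [[0.0] for _ in peer_targets]  # each peer's parameter copy
for _ in range(30):                     # local step, then synchronize
    params = [local_sgd_step(p, grad_quadratic(t))
              for p, t in zip(params, peer_targets)]
    avg = average_params(params)
    params = [list(avg) for _ in peer_targets]
```

The shared parameters converge toward the minimizer of the averaged objective (here 2.0); the constraints analyzed in the work above arise precisely because this synchronization step is expensive over the Internet.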
Recent progress in natural language processing has been driven by advances in both model architecture and model pretraining. Transformer architectures have facilitated building higher-capacity models, and pretraining has made it possible to effectively utilize this capacity for a wide variety of tasks. Transformers is an open-source library with the goal of opening up these advances to the wider machine learning community. The library consists of carefully engineered state-of-the-art Transformer architectures under a unified API. Backing this library is a curated collection of pretrained models made by and available for the community. Transformers is designed to be extensible by researchers, simple for practitioners, and fast and robust in industrial deployments. The library is available at https://github.com/huggingface/transformers.
The most widely studied explainable AI (XAI) approaches are unsound. This is the case with well-known model-agnostic explanation approaches, and it is also the case with approaches based on saliency maps. One solution is to consider intrinsic interpretability, which does not exhibit the drawback of unsoundness. Unfortunately, intrinsic interpretability can display unwieldy explanation redundancy. Formal explainability represents the alternative to these non-rigorous approaches, with one example being PI-explanations. Unfortunately, PI-explanations also exhibit important drawbacks, the most visible of which is arguably their size. Recently, it has been observed that the (absolute) rigor of PI-explanations can be traded off for a smaller explanation size by computing the so-called relevant sets. Given some positive δ, a set S of features is δ-relevant if, when the features in S are fixed, the probability of getting the target class exceeds δ. However, even for very simple classifiers, the complexity of computing relevant sets of features is prohibitive, with the decision problem being NP^PP-complete for circuit-based classifiers. In contrast with earlier negative results, this paper investigates practical approaches for computing relevant sets for a number of widely used classifiers, including Decision Trees (DTs), Naive Bayes Classifiers (NBCs), and several families of classifiers obtained from propositional languages. Moreover, the paper shows that, in practice, and for these families of classifiers, relevant sets are easy to compute. Furthermore, the experiments confirm that succinct sets of relevant features can be obtained for the families of classifiers considered.
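For a tiny boolean classifier, the δ-relevance definition above can be checked by exhaustive enumeration: fix the features in S, average the classifier's output over all assignments to the remaining features, and compare the resulting probability to δ. The sketch below is a brute-force illustration of the definition only (the majority-vote classifier and all function names are assumptions, not the paper's algorithms, which avoid exactly this exponential enumeration):

```python
from itertools import product

def prob_target(classifier, n_features, fixed, target=1):
    """Probability of the target class when the features in `fixed`
    (a dict index -> value) are held, and the rest are uniform over {0, 1}."""
    free = [i for i in range(n_features) if i not in fixed]
    total = hits = 0
    for bits in product([0, 1], repeat=len(free)):
        x = dict(fixed)
        x.update(zip(free, bits))
        point = [x[i] for i in range(n_features)]
        total += 1
        hits += classifier(point) == target
    return hits / total

def is_delta_relevant(classifier, n_features, instance, S, delta, target=1):
    """S is δ-relevant for `instance` if fixing the features in S makes the
    probability of the target class exceed delta."""
    fixed = {i: instance[i] for i in S}
    return prob_target(classifier, n_features, fixed, target) > delta

# Toy classifier: majority vote over three boolean features.
maj = lambda x: int(sum(x) >= 2)
```

For instance, for `maj` and the instance (1, 1, 0), fixing features {0, 1} forces the prediction with probability 1, while fixing {0} alone only yields probability 0.75, so only the former is 0.9-relevant.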
Recommender systems have proven to be valuable tools for helping users in their daily activities by extracting content relevant to them (e.g., finding relevant places to visit, content to consume, items to purchase). However, to be effective, these systems need to collect and analyze large volumes of personal data (e.g., location check-ins, movie ratings, click rates, etc.), which exposes users to numerous privacy threats. In this context, recommender systems based on federated learning (FL) appear to be a promising solution for computing accurate recommendations while keeping personal data on the users' devices. However, FL, and therefore FL-based recommender systems, rely on a central server, which, besides being vulnerable to attacks, can face scalability issues. To address this problem, we propose Pepper, a decentralized recommender system based on gossip-learning principles. In Pepper, users gossip model updates and aggregate them asynchronously. At the heart of Pepper lie two key components: a personalized peer-sampling protocol that keeps, in the neighborhood of each node, a proportion of nodes with interests similar to its own, and a simple yet effective model-aggregation function that builds a model better suited to each user. Through experiments on three real datasets implementing two use cases, location check-in recommendation and movie recommendation, we demonstrate that our solution converges up to 42% faster than its decentralized competitors and improves hit ratio and long-tail performance by up to 21%.
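The two components described above can be sketched in a few lines: pick neighbors whose interest profiles are most similar to one's own, and average peer models weighted by that similarity. This is a simplified illustration of the idea, not Pepper's actual protocol; the data layout and function names are assumptions:

```python
import math

def cosine(u, v):
    """Cosine similarity between two interest-profile vectors."""
    num = sum(a * b for a, b in zip(u, v))
    den = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return num / den if den else 0.0

def sample_peers(my_profile, candidates, k):
    """Personalized peer sampling, greatly simplified: keep the k candidate
    nodes whose interest profiles are most similar to ours."""
    ranked = sorted(candidates,
                    key=lambda c: cosine(my_profile, c["profile"]),
                    reverse=True)
    return ranked[:k]

def aggregate(my_model, my_profile, peers):
    """Similarity-weighted averaging of peer models with our own,
    so dissimilar peers contribute less to the personalized model."""
    weights = [1.0] + [cosine(my_profile, p["profile"]) for p in peers]
    models = [my_model] + [p["model"] for p in peers]
    total = sum(weights)
    return [sum(w * m[i] for w, m in zip(weights, models)) / total
            for i in range(len(my_model))]
```

In a real gossip deployment, each node would run sampling and aggregation asynchronously as peer updates arrive, rather than in a synchronized round.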
Despite the advances observed with model-agnostic explainable AI (XAI), model-agnostic XAI approaches can produce incorrect explanations. One alternative are the so-called formal approaches to XAI, which include PI-explanations. Unfortunately, PI-explanations also exhibit important drawbacks, the most visible of which is arguably their size. The computation of relevant features allows trading off probabilistic precision against the number of features in an explanation. However, even for very simple classifiers, the complexity of computing sets of relevant features is prohibitive. This paper investigates the computation of relevant sets for Naive Bayes Classifiers (NBCs) and shows that, in practice, these sets are easy to compute. Furthermore, the experiments confirm that succinct sets of relevant features can be obtained with NBCs.
Understanding the sensory-induced cortical patterns in the primary visual cortex V1 is an important challenge, both for its physiological motivations and for improving our understanding of human perception and visual organization. In this work, we focus on pattern formation in the visual cortex when cortical activity is driven by geometric visual-hallucination-like stimuli. In particular, we present a theoretical framework for sensory-induced hallucinations that allows one to reproduce novel psychophysical results such as the MacKay effect (Nature, 1957) and the Billock and Tsou experiences (PNAS, 2007).
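Theoretical frameworks for cortical pattern formation in V1 are commonly phrased as neural-field (Amari/Wilson-Cowan-type) equations. As a hedged sketch of the kind of dynamics such a framework typically controls (the specific equation and symbols below are standard in this literature but are an assumption, not taken from the abstract):

```latex
\partial_t a(x,t) \;=\; -a(x,t)
  \;+\; \int_{\Omega} \omega(x-y)\, f\!\big(a(y,t)\big)\, \mathrm{d}y
  \;+\; I(x,t),
```

where \(a(x,t)\) is the cortical activity at position \(x\), \(\omega\) a lateral-connectivity kernel, \(f\) a sigmoidal firing-rate nonlinearity, and \(I\) the sensory input, here a hallucination-like stimulus such as the MacKay rays; sensory-driven patterns arise from the interplay between \(I\) and the intrinsic instabilities of the homogeneous state.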